Skip to content

Validation support and associated updates#59

Merged
malucius-rh merged 15 commits into
masterfrom
validation_support
Apr 8, 2026
Merged

Validation support and associated updates#59
malucius-rh merged 15 commits into
masterfrom
validation_support

Conversation

@malucius-rh
Copy link
Copy Markdown
Contributor

Description

Adds results file validation for auto and throughput modes
Adds tooling updates (test_tools download changes, calling of package_tool) which enable the validation updates and future work

Note: if running throughput mode in a non-default way, disable validation since validation currently does do dynamic results format

Before/After Comparison

Before: No validation of results files
After: Validation of results files

Clerical Stuff

This closes #58
Relates to JIRA: RPOPC-917
CSV & JSON files used in validation:
results_iozone.json.tput.txt
results_iozone.csv.tput.txt
results_iozone.json.auto.txt
results_iozone.csv.auto.txt

@malucius-rh malucius-rh requested a review from dvalinrh April 5, 2026 20:56
@malucius-rh malucius-rh self-assigned this Apr 5, 2026
@malucius-rh malucius-rh added the enhancement New feature or request label Apr 5, 2026
@qodo-code-review
Copy link
Copy Markdown

Review Summary by Qodo

Add results validation and refactor test tools installation

✨ Enhancement 🧪 Tests

Grey Divider

Walkthroughs

Description
• Implement results file validation for auto and throughput modes
• Refactor test tools installation using wget/curl/git fallback approach
• Move gather_data execution after tools installation completion
• Add Pydantic schema files for auto and throughput result validation
• Replace hardcoded tool paths with TOOLS_BIN environment variable
• Convert CSV results to JSON and verify against schema files
Diagram
flowchart LR
  A["Test Tools Installation"] -->|wget/curl/git| B["TOOLS_BIN Directory"]
  B --> C["Gather Hardware Data"]
  C --> D["Run IOzone Tests"]
  D --> E["Generate CSV Results"]
  E --> F["Convert to JSON"]
  F --> G["Validate Against Schema"]
  G -->|Pass/Fail| H["Archive Results"]
Loading

Grey Divider

File Changes

1. iozone/iozone_run.sh ✨ Enhancement +87/-29

Refactor tools installation and add results validation

• Refactored test tools installation with wget/curl/git fallback methods instead of direct git clone
• Introduced TOOLS_BIN environment variable to standardize tool paths throughout script
• Moved gather_data execution to after tools installation instead of early in script
• Removed --tools_git command-line argument parsing logic
• Added results validation sequence using csv_to_json and verify_results tools
• Updated all tool invocations to use ${TOOLS_BIN} variable prefix
• Added JSON results file preservation in archive alongside CSV
• Implemented dynamic header string and schema file selection based on test mode
• Changed error handling to use error code variables instead of hardcoded values

iozone/iozone_run.sh


2. iozone/results_iozone_auto_schema.py 🧪 Tests +34/-0

Define auto mode results validation schema

• Created Pydantic schema for auto mode IOzone results validation
• Defined Filesys enum with xfs, ext4, ext3 filesystem types
• Defined Testmode enum with directio, incache, incache_fsync, incache_mmap, outcache modes
• Specified Iozone_Results model with typed fields including performance metrics and timestamps
• Applied validation constraints (gt=0) for integer performance fields

iozone/results_iozone_auto_schema.py


3. iozone/results_iozone_tput_schema.py 🧪 Tests +40/-0

Define throughput mode results validation schema

• Created Pydantic schema for throughput mode IOzone results validation
• Defined Filesys and Testmode enums matching auto mode definitions
• Defined Operation enum with 13 IOzone operation types
• Specified Iozone_Results model with per-process performance fields (proc1, proc2, proc4)
• Applied validation constraints (gt=0, allow_inf_nan=False) for float performance fields

iozone/results_iozone_tput_schema.py


Grey Divider

Qodo Logo

@qodo-code-review
Copy link
Copy Markdown

qodo-code-review Bot commented Apr 5, 2026

Code Review by Qodo

🐞 Bugs (4)   📘 Rule violations (0)   📎 Requirement gaps (1)   🎨 UX Issues (0)
🐞\ ≡ Correctness (3) ☼ Reliability (1)
📎\ ≡ Correctness (1)

Grey Divider


Action required

1. exit out breaks error handling📎
Description
When csv_to_json fails, the script calls exit out ... instead of exit_out, so the failure may
not be reported with the intended actionable message and consistent exit code. This can cause
malformed/unvalidated results handling to fail in a non-compliant, non-actionable way.
Code

iozone/iozone_run.sh[R1877-1879]

+if [[ $rtc -ne 0 ]]; then
+	exit out "Error: csv_to_json failed" $E_GENERAL
+fi
Evidence
PR Compliance ID 1 requires validation failures be clearly flagged with actionable errors; the new
validation path’s error branch uses an invalid exit invocation (exit out) rather than the
wrapper’s exit_out, which prevents the intended explicit error reporting on conversion failure.

Validate IOzone results format using Pydantic
iozone/iozone_run.sh[1872-1880]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The new validation sequence attempts to exit on `csv_to_json` failure using `exit out ...`, which is not the defined `exit_out` helper. This can suppress the intended actionable error message and produce inconsistent exit behavior.
## Issue Context
This is part of the newly-added results validation flow; failures here must be clearly flagged per the validation compliance requirement.
## Fix Focus Areas
- iozone/iozone_run.sh[1875-1879]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


2. Proc schema mismatches header 🐞
Description
Throughput validation selects a fixed schema requiring proc1/proc2/proc4, but the CSV header is
generated from file_count_list (default is only "1"). This makes default throughput runs fail
validation and also fails for any file_count_list not exactly 1,2,4.
Code

iozone/results_iozone_tput_schema.py[R31-39]

+class Iozone_Results (pydantic.BaseModel):
+	fs: Filesys
+	mode: Testmode
+	op: Operation
+	proc1: float = pydantic.Field(gt=0, allow_inf_nan=False)
+	proc2: float = pydantic.Field(gt=0, allow_inf_nan=False)
+	proc4: float = pydantic.Field(gt=0, allow_inf_nan=False)
+	Start_Date: datetime.datetime
+	End_Date: datetime.datetime
Evidence
The script defaults file_count_list to "1" and builds results_header_str from it, so the
generated CSV/JSON will only contain proc1 by default. The throughput schema hard-requires proc1,
proc2, and proc4, so verify_results cannot succeed unless the header/data matches those exact
fields.

iozone/iozone_run.sh[74-76]
iozone/iozone_run.sh[1744-1751]
iozone/results_iozone_tput_schema.py[31-39]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Throughput validation currently cannot pass with the default configuration because the generated header (from `file_count_list`) does not match the fixed Pydantic schema fields (`proc1`, `proc2`, `proc4`).
### Issue Context
- Default: `file_count_list="1"`.
- Header generation uses `file_count_list` to build `proc*` columns.
- Schema requires proc1/proc2/proc4 regardless of header.
### Fix Focus Areas
Pick one:
1) Make validation conditional (skip or select a matching schema) unless `file_count_list` is exactly `1,2,4`.
2) Change default `file_count_list` for throughput to `1,2,4` and enforce it when validation is enabled.
3) Update the throughput schema to handle a variable set of `proc*` fields (e.g., optional fields + `extra='allow'`, or a model that validates a dict of proc values).
### Fix Focus Areas
- iozone/iozone_run.sh[74-76]
- iozone/iozone_run.sh[1744-1752]
- iozone/results_iozone_tput_schema.py[31-39]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Throughput results glob mismatch 🐞
Description
Throughput result filenames use mode strings like "incache+fsync" and "incache+mmap", but
reduce_non_auto_data searches for files with "incache_fsync" and "incache_mmap" in the filename
pattern. This can prevent finding/parsing throughput outputs and leads to empty/partial CSV/JSON
(and subsequent validation failure).
Code

iozone/iozone_run.sh[1]

fi
Evidence
The wrapper generates throughput test outputs with runtest_name values containing + (e.g.,
incache+fsync), and do_test_actual embeds that string in the output filename. The reducer loops
over underscore variants and uses compgen -G *${testmode}_*.iozone, which won’t match files named
with incache+fsync_default etc.

iozone/iozone_run.sh[1006-1053]
iozone/iozone_run.sh[1444-1467]
iozone/iozone_run.sh[632-636]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`reduce_non_auto_data()` uses different mode tokens than the ones used to name throughput result files, so it may not discover any `*.iozone` outputs to reduce.
### Issue Context
- Throughput mode calls `do_test ... "incache+fsync"` and `do_test ... "incache+mmap"`.
- Output filenames include `${runtest_name}_${one_run}`.
- Reducer searches for `incache_fsync` / `incache_mmap` / `outofcache`.
### Fix Focus Areas
- Update `reduce_non_auto_data` to iterate over the actual `runtest_name` strings used by `do_test` (e.g., `incache`, `incache+fsync`, `incache+mmap`, `directio`, `outcache`) and adjust the glob accordingly.
### Fix Focus Areas
- iozone/iozone_run.sh[1006-1053]
- iozone/iozone_run.sh[1444-1467]
- iozone/iozone_run.sh[632-636]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

4. Schemas ignore extra fields 📎
Description
The new Pydantic models do not configure extra/unknown fields handling, so unexpected columns/keys
may be silently accepted instead of being rejected as malformed results. This weakens the schema
validation guarantee required by the compliance checklist.
Code

iozone/results_iozone_auto_schema.py[R16-34]

+class Iozone_Results (pydantic.BaseModel):
+	fs: Filesys
+	mode: Testmode
+	all_ios: int = pydantic.Field(gt=0)
+	initwrite: int = pydantic.Field(gt=0)
+	rewrite: int = pydantic.Field(gt=0)
+	read: int = pydantic.Field(gt=0)
+	reread: int = pydantic.Field(gt=0)
+	rndread: int = pydantic.Field(gt=0)
+	rndwrite: int = pydantic.Field(gt=0)
+	backread: int = pydantic.Field(gt=0)
+	recrewrite: int = pydantic.Field(gt=0)
+	strideread: int = pydantic.Field(gt=0)
+	fwrite: float = pydantic.Field()
+	frewrite: float = pydantic.Field()
+	fread: float = pydantic.Field()
+	freread: float = pydantic.Field()
+	Start_Date: datetime.datetime
+	End_Date: datetime.datetime
Evidence
PR Compliance ID 1 requires malformed results be rejected/flagged via schema validation; the added
Pydantic models define fields but include no configuration to forbid extra fields, which can allow
structurally unexpected data to pass validation unnoticed.

Validate IOzone results format using Pydantic
iozone/results_iozone_auto_schema.py[16-34]
iozone/results_iozone_tput_schema.py[31-39]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The Pydantic schemas do not explicitly forbid extra/unknown fields, which can allow malformed/extended results payloads to pass validation.
## Issue Context
This PR introduces Pydantic validation for IOzone results; to ensure only correctly structured results proceed, schemas should reject unexpected keys/columns.
## Fix Focus Areas
- iozone/results_iozone_auto_schema.py[16-34]
- iozone/results_iozone_tput_schema.py[31-39]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


5. Filesystem enum too strict 🐞
Description
The validation schemas only allow fs values {xfs, ext4, ext3}, but the wrapper advertises gfs/gfs2
support via --filesystems. Runs using gfs/gfs2 will fail verification even if the benchmark run
succeeds.
Code

iozone/results_iozone_auto_schema.py[R4-8]

+class Filesys(Enum):
+	xfs = "xfs"
+	ext4 = "ext4"
+	ext3 = "ext3"
+
Evidence
The wrapper’s usage text documents gfs/gfs2 as supported filesystem options, but both schemas
constrain the filesystem enum to xfs/ext4/ext3 only, guaranteeing validation failure for those
documented modes.

iozone/iozone_run.sh[194-197]
iozone/results_iozone_auto_schema.py[4-8]
iozone/results_iozone_tput_schema.py[4-8]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Schemas reject filesystem values that the wrapper claims to support (e.g., gfs/gfs2), causing validation failures for documented configurations.
### Issue Context
The wrapper documents `--filesystems <xfs,ext4,ext3,gfs,gfs2>`, but the schema enum does not include gfs/gfs2.
### Fix Focus Areas
- Add `gfs` and `gfs2` enum values (and any other supported fs types), or replace the enum with a constrained string validator that matches the wrapper’s supported list.
### Fix Focus Areas
- iozone/iozone_run.sh[194-197]
- iozone/results_iozone_auto_schema.py[4-8]
- iozone/results_iozone_tput_schema.py[4-8]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


6. Tools install not validated 🐞
Description
After attempting wget/curl/git installation of TOOLS_BIN, the script unconditionally runs
${TOOLS_BIN}/gather_data without verifying that TOOLS_BIN was successfully created. If all install
attempts fail (or unzip/mv fails), the script continues into confusing “file not found” failures.
Code

iozone/iozone_run.sh[R317-324]

+attempt_tools_wget
+attempt_tools_curl
+attempt_tools_git
+
+# End of test tools installation
+
+# Gather hardware information
+${TOOLS_BIN}/gather_data ${curdir}
Evidence
The installer functions only create TOOLS_BIN on the happy path; failures leave TOOLS_BIN absent.
Immediately after the attempts, the script executes ${TOOLS_BIN}/gather_data without checking for
existence/executability.

iozone/iozone_run.sh[285-325]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Tool bootstrap tries multiple download methods but never asserts success before using `${TOOLS_BIN}`.
### Issue Context
If wget/curl downloads fail or unzip/mv fails, `${TOOLS_BIN}` may not exist, yet the script calls `${TOOLS_BIN}/gather_data` and later `source ${TOOLS_BIN}/general_setup`.
### Fix Focus Areas
- Add a final guard after `attempt_tools_*` calls, e.g.:
- `if [[ ! -x "${TOOLS_BIN}/gather_data" ]]; then exit_out "Error: failed to install test_tools into ${TOOLS_BIN}" 101; fi`
- Optionally clean up partial `main.zip` / extracted dirs on failures.
### Fix Focus Areas
- iozone/iozone_run.sh[285-325]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

ⓘ The new review experience is currently in Beta. Learn more

Grey Divider

Qodo Logo

Comment thread iozone/iozone_run.sh
Comment thread iozone/results_iozone_tput_schema.py
Copy link
Copy Markdown
Contributor

@dvalinrh dvalinrh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I found, please address al what qodo found.

Comment thread iozone/iozone_run.sh Outdated
@malucius-rh malucius-rh requested a review from dvalinrh April 7, 2026 14:19
Copy link
Copy Markdown
Contributor

@dvalinrh dvalinrh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dvalinrh dvalinrh added the group_review_lgtm Indicates approval after a group review meeting label Apr 8, 2026
@malucius-rh malucius-rh merged commit 76b93c9 into master Apr 8, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request group_review_lgtm Indicates approval after a group review meeting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CPT: IOzone needs results format validation added

2 participants